1.  INTRODUCTION


In this attachment we cover work on two aspects of the research. In the first part, we investigate
a two-stage FEC scheme for video streaming on wireless networks that operates at both the MAC/PHY
layer and the application layer. We employ header CRC and FEC at the MAC/PHY layer, and make a slight
change so that packets with bit errors are forwarded up rather than being discarded there. In the
application layer, we employ packet-level FEC to recover dropped packets. In the second part, we
present work on a distributed extension of our FGA-FEC that can work when there is no high-speed
backbone, i.e. in peer-to-peer and ad hoc networks.


2.  FGA-FEC  FOR  WIRELESS  NETWORKS 


Current IEEE 802.11 wireless LANs are designed for reliable data transmission. They treat classical
data and multimedia flows alike, even though these two kinds of flows have different requirements. The
wireless physical (PHY) and media access control (MAC) layers [5] are designed to be as reliable as
possible, so that a single bit error in a packet can result in the whole packet being dropped. However,
due to the error resilience features of many state-of-the-art multimedia codecs and the use of error
correction strategies at the application layer, packets with errors are still useful for multimedia
applications. To efficiently protect data from losses and errors in a wireless environment, two
questions arise: at which protocol layer should the protection scheme be located, and how should the
protection strategies be deployed? One simple solution is to add protection mechanisms at each protocol
layer, as in the current 802.11 protocol. However, we argue that this layered protection strategy
does not always deliver multimedia data efficiently, because each protocol layer operates
independently of the others.

In our work, we propose a two-stage FEC scheme with an enhanced MAC protocol to efficiently support
multimedia data transmission over wireless LANs. Since only the application knows the characteristics
of the multimedia data, the proposed scheme enables joint optimization of protection strategies across
the protocol stack, and packets with errors are delivered to the application layer, where they are
either corrected or dropped. We enhance the MAC/PHY layers to efficiently support multimedia flows by
using both header CRC and FEC. We also slightly modify the protocol stack so that it can deliver
packets with errors from the MAC layer to the application layer, instead of just dropping them. For
the two-stage FEC, we add FEC only at the application layer, but can correct both application layer
packet drops and MAC/PHY layer bit errors.

Our proposed scheme has the following characteristics: (1) network efficiency, through an enhanced MAC
protocol that uses header CRC and FEC to improve application layer effective throughput; (2) protection
efficiency, through unequal error protection; and (3) easy deployment, since we only process FEC at the
application layer. Furthermore, the proposed scheme combines bit-level protection codes (good at
random bit error correction) and symbol-level codes (powerful at correcting burst losses) to correct
both bit errors at the MAC/PHY layers and packet losses at the application layer.

2.1  System  Overview 

The  proposed  system  diagram  is  shown  in  Fig.  1. 


Figure  1.  System  diagram  of  the  proposed  two-stage  protection  scheme 

At  the  application  layer,  two-stage  FEC  is  applied  to  the  encoded  video  bitstream  based  on  network 
conditions.  In  stage  1,  packet-level  FEC  is  added  across  application  layer  packets  to  correct  packet  drops 
due  to  congestion  or  route  disruption.  Stage  2  is  processed  within  each  application  packet,  where  a  small 
amount  of  bit-level  FEC  is  added  to  recover  bit  errors  from  the  MAC/PHY  layers  for  each  packet.  At  the 
receiver  side,  we  first  process  the  bit-level  FEC,  so  the  bit  errors  from  the  MAC/PHY  layers  can  be 
recovered.  Then  we  pass  the  bitstream  to  the  stage  1  FEC  decoder  for  further  correction.  In  our  work,  we 
chose  Reed-Solomon  (RS)  codes  for  packet-level  protection  (stage  1)  and  BCH  codes  for  bit-level 
protection  (stage  2). 


2.1.1  Enhanced  Protocol  Stack 

To efficiently support multimedia applications, we slightly modify the protocol stack so that it can
deliver packets with errors to the application layer. This can be achieved by simply turning off the
CRC checksum function in the MAC/PHY layers. The UDP-Lite [6] protocol should be used at the transport
layer to match the enhanced MAC protocol. To ensure better delivery and to improve the effective
application layer throughput, we enhanced the MAC/PHY layer by modifying the 802.11 packet CRC
mechanism to check only the header part, possibly also with bit-level FEC for the header part.

[Figure 2 shows the packet layout: the MAC, IP, UDP, and APP headers are covered by the header
CRC/FEC, and are followed by the payload and its FEC.]

Figure 2.  Enhanced MAC/PHY layer with Header CRC and Header FEC

The header part of each protocol layer is crucial: if the header contains errors, the whole packet is
usually useless. We use header CRC and header FEC to enhance the MAC/PHY layers to efficiently support
multimedia delivery. We slightly modified the 802.11 MAC/PHY layer packet CRC mechanism to check for
errors only within the header part, as shown in Fig. 2. The whole packet is dropped if the header CRC
or FEC fails.
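The header-only check can be illustrated with a minimal sketch. It is not the 802.11 frame format: the 60-byte header length is taken from Section 2.1.3, while the CRC-32 from Python's `zlib`, the placement of the checksum right after the header, and the names `build_frame` and `receive_frame` are assumptions made purely for illustration.

```python
import zlib

HEADER_LEN = 60  # bytes, the header size used in Section 2.1.3
CRC_LEN = 4      # assumed: a CRC-32 stored right after the header


def build_frame(header: bytes, payload: bytes) -> bytes:
    """Attach a CRC computed over the header only (hypothetical layout)."""
    assert len(header) == HEADER_LEN
    crc = zlib.crc32(header).to_bytes(CRC_LEN, "big")
    return header + crc + payload


def receive_frame(frame: bytes):
    """Return (header, payload) if the header CRC passes, else None.

    Bit errors in the payload are not checked here, so a damaged payload
    is still passed up for the application layer FEC to handle.
    """
    header = frame[:HEADER_LEN]
    crc = int.from_bytes(frame[HEADER_LEN:HEADER_LEN + CRC_LEN], "big")
    if zlib.crc32(header) != crc:
        return None  # header damaged: drop the whole packet
    return header, frame[HEADER_LEN + CRC_LEN:]


header = bytes(range(HEADER_LEN))
payload = bytes(1000)

damaged_payload = bytearray(build_frame(header, payload))
damaged_payload[HEADER_LEN + CRC_LEN + 10] ^= 0x01  # flip one payload bit
delivered = receive_frame(bytes(damaged_payload))

damaged_header = bytearray(build_frame(header, payload))
damaged_header[5] ^= 0x01  # flip one header bit
dropped = receive_frame(bytes(damaged_header))
```

In this sketch `delivered` still carries the payload with its single bit error, while `dropped` is `None` because the header check failed.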




2.1.2  Two-stage  FEC  Scheme 


Packet  losses  in  a  wireless  channel  can  be  roughly  categorized  into  two  types:  (a)  packets  dropped  due  to 
routing  disruption  or  congestion  at  the  intermediate  nodes,  and  (b)  packets  discarded  at  the  MAC/PHY 
layers  due  to  internal  bit  errors.  A  two-stage  FEC  scheme  is  shown  in  Fig.  3. 

[Figure 3 shows stage 1, packet-level FEC RS(N,K) applied across the data packets, and stage 2,
bit-level FEC BCH(n,k,t) appended within each packet.]

Figure 3.  Detail of the proposed two-stage FEC scheme

In  stage  1,  packet  level  FEC  is  added  across  application  layer  packets  to  correct  packet  drops  due  to 
congestion  or  route  disruption.  In  stage  2,  FEC  is  processed  within  each  application  packet,  and  a  very 
small  amount  of  bit-level  FEC  is  added  to  recover  any  bit  errors  from  the  MAC/PHY  layers.  We  use  BCH 
codes  for  stage  2. 


2.1.3  Residual  packet  loss  probability 

We compare the protection performance of our proposed schemes (two-stage FEC + header CRC/FEC)
with conventional application layer FEC (RS only + 802.11) in terms of residual packet-error rate. The
maximum number of MAC-layer retransmissions is set to one for all schemes. Any bit error remaining in a
packet after FEC correction results in the packet being dropped. This is comparable to the situation in
conventional 802.11 error-free delivery. The parameter setup is given in Table 1. The packet payload
and packet header sizes are 1000 bytes and 60 bytes, respectively. For RS only, we add FEC using an
RS(255, 239) code across packets, i.e. a code rate of 239/255. For IEEE 802.11, we follow the 802.11
wireless LAN standard. For two-stage FEC, we use RS(255, 245) as stage 1 FEC across the application
layer packets, and apply the BCH(8191, 8000, 14) code within each application layer packet as stage 2.
Two-stage FEC with the header FEC scheme uses the same stage 1 and stage 2 FEC as two-stage FEC with
header CRC, but additionally uses BCH(511, 502, 1) to protect the header part, as shown in Fig. 2. The
proposed two-stage FEC scheme significantly outperforms the conventional 802.11 plus application-only
protection strategy, as shown in Fig. 4.
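To give a feel for how such residual loss figures arise, the following sketch computes two binomial tail probabilities. It assumes a memoryless binary symmetric channel, independent packet losses, and a decoder that fails exactly when the number of errors (or lost packets) exceeds its correction capability; retransmissions and the interaction between the two stages are not modelled. The function names are hypothetical and this is not the computation used to produce Fig. 4.

```python
from math import comb


def decode_failure_prob(n: int, t: int, ber: float) -> float:
    """P(more than t of n bits are in error) on a memoryless BSC."""
    p_ok = sum(comb(n, i) * ber**i * (1.0 - ber) ** (n - i) for i in range(t + 1))
    return max(0.0, 1.0 - p_ok)  # clamp tiny negative rounding residue


def rs_block_failure_prob(N: int, K: int, q: float) -> float:
    """P(more than N-K of N packets are lost), each lost independently with prob q.

    An RS(N,K) erasure code can rebuild the block from any K received packets.
    """
    return decode_failure_prob(N, N - K, q)


# Stage 2: BCH(8191, 8000, 14) fails when a packet has more than 14 bit errors.
p_bch_low = decode_failure_prob(8191, 14, 1e-4)
p_bch_high = decode_failure_prob(8191, 14, 1e-3)

# Stage 1: RS(255, 245) fails when more than 10 packets of a block are lost.
p_rs = rs_block_failure_prob(255, 245, 0.01)
```

Under these assumptions the stage 2 failure probability rises steeply between the two bit-error rates, which is the regime where stage 1 has to absorb the resulting packet drops.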


Protection Method                FEC codes                                            Code rate
802.11                           SW-ARQ                                               1 retransmission
RS only                          RS(255,239)                                          239/255
Two-stage FEC with header CRC    BCH(8191,8000,14) + RS(255,245)                      239/255
Two-stage FEC with header FEC    BCH(8191,8000,14) + RS(255,245) + BCH(511,502,1)     239/255

Table 1.  Parameter setups for comparison of several protection schemes
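As a quick arithmetic check on the code rate column, the sketch below multiplies the stage 1 and stage 2 rates for the payload and compares the product with the RS-only rate. It considers the payload codes only; the header protection is left out, which is an assumption of this illustration.

```python
from fractions import Fraction

# RS only: one packet-level code across packets.
rs_only = Fraction(239, 255)

# Two-stage FEC: bit-level BCH(8191, 8000, 14) inside each packet,
# combined with packet-level RS(255, 245) across packets.
two_stage = Fraction(8000, 8191) * Fraction(245, 255)

print(float(rs_only))
print(float(two_stage))
```

The two values agree to within about 0.001 (roughly 0.937 versus 0.938), so the schemes in Table 1 are compared at essentially the same overall redundancy.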

Figure 4.  Residual packet-loss probability of several FEC schemes vs. BER


2.2  Simulations 

To evaluate the performance of our proposed schemes in terms of effective application layer throughput
and video PSNR, we perform several simulations to compare our two-stage FEC plus enhanced MAC
protocol with the conventional 802.11 based method. The wireless module of the network simulator
ns-2 [7] is used in this section, and the simulation topology is shown in Fig. 5. Two types of
simulations are performed: single hop and multihop (2 hops here). In the single hop simulation, node1
is the sender, node2 is the receiver, and node3 is idle. There is no contention in this scenario. In
the multihop simulation, node1 is the sender, node3 is the receiver, and node2 is an intermediate node
that forwards data from sender to receiver. Contention exists among the three nodes. The wireless
physical layer bandwidth is set to 2 Mbps. The bit-error rates in this section are all averaged over
many trials, and the average bit-error burst length on the Gilbert channel is 2. In order to reduce
delay variation, we set the maximum number of MAC-layer retransmissions to 2. Retransmission is based
on standard 802.11 SW-ARQ. Both RTS and CTS packets are exchanged before a packet transmission.
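A bursty bit-error channel of this kind can be sketched as a two-state Markov chain. The version below is a simplified Gilbert model in which every bit sent in the Bad state is in error and none in the Good state; that simplification, the function names, and the way the transition probabilities are derived from the target BER and mean burst length are assumptions of this illustration, not the ns-2 error model used in the simulations.

```python
import random


def gilbert_params(ber: float, mean_burst: float):
    """Transition probabilities (Good->Bad, Bad->Good) for a simplified Gilbert channel.

    The mean sojourn time in Bad is 1/p_bg bits, and the stationary
    probability of Bad, p_gb / (p_gb + p_bg), equals the target BER.
    """
    p_bg = 1.0 / mean_burst
    p_gb = ber * p_bg / (1.0 - ber)
    return p_gb, p_bg


def gilbert_errors(n_bits: int, ber: float, mean_burst: float, seed: int = 0):
    """Return a list of booleans, True where the bit is received in error."""
    rng = random.Random(seed)
    p_gb, p_bg = gilbert_params(ber, mean_burst)
    bad = rng.random() < ber  # start from the stationary distribution
    errors = []
    for _ in range(n_bits):
        errors.append(bad)
        if bad:
            bad = rng.random() >= p_bg  # stay in Bad with probability 1 - p_bg
        else:
            bad = rng.random() < p_gb   # enter Bad with probability p_gb
    return errors


pattern = gilbert_errors(200_000, ber=0.01, mean_burst=2.0)
empirical_ber = sum(pattern) / len(pattern)
```

With a mean burst length of 2, as in the setup above, errors tend to arrive in short runs rather than being spread uniformly as on the BSC.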


[Figure 5 shows the three-node topology, with labelled distances of 200 m and 200 m.]

Figure 5.  Our ns-2 video simulation topology


2.2.1  Application  layer  effective  throughput


To get the maximum effective throughput (i.e. error-free throughput) in the application layer, the
application layer CBR traffic is set to 2 Mbps from sender to receiver in the single hop simulations, to
saturate the channel. The packet and header sizes are the same as in Section 2.1.3. To combat
channel bit errors, a BCH(8191, 8000, 14) code is applied to each packet in the header CRC and header
FEC schemes. Packets are dropped upon BCH decoder failure. For the 802.11 packet CRC scheme, we
directly follow the standard, and a packet CRC is performed at the receiver. Any bit error results in
the whole packet being dropped and triggers retransmissions until the maximum number of
retransmissions is reached. In the header CRC scheme, the receiver performs a header CRC, and drops a
packet if the header CRC fails. In the header FEC scheme, a BCH(510, 480, 3) code is applied to the
60 byte header part, resulting in 2 additional FEC bytes. This code can correct up to 3 bit errors in a
511 bit codeword. If the BCH decoder cannot successfully decode the codeword, then a retransmission is
triggered. In the multihop simulations, since there is contention among the three nodes, we reduce the
application layer CBR traffic to 1.2 Mbps.


[Figure 6 panels: (a) single hop on BSC channel; (b) single hop on Gilbert channel; (c) single hop
video PSNR-Y; (d) multihop on BSC channel; (e) multihop on Gilbert channel; (f) multihop video PSNR-Y.]

Figure 6:  Effective application layer throughput on BSC and Gilbert channels for different physical
layer BER and corresponding video PSNR-Y

Fig. 6 shows the effective application-layer throughput of the single hop simulation on the BSC
(Fig. 6(a)) and Gilbert (Fig. 6(b)) channels, and of the multihop simulation on the BSC (Fig. 6(d)) and
Gilbert (Fig. 6(e)) channels. We see that standard IEEE 802.11 performs very poorly at high bit-error
rates, because of its error-free delivery design requirement. The header CRC scheme performs better
than 802.11 because its CRC covers far fewer bits. With the help of header FEC, the probability of
header error is greatly reduced. The remaining degradation of the curves at higher bit-error rates is
most likely due to ACK errors and RTS/CTS failures.
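The effect of the smaller CRC coverage can be made concrete with a short calculation. The sketch assumes independent bit errors (a BSC), a single transmission attempt, and the 1000-byte payload and 60-byte header of Section 2.1.3; payload errors under the header CRC scheme are left to the stage 2 BCH code and are not modelled here. The function names are hypothetical.

```python
PAYLOAD_BITS = 1000 * 8
HEADER_BITS = 60 * 8


def any_bit_error_prob(n_bits: int, ber: float) -> float:
    """P(at least one of n_bits is in error) with independent bit errors."""
    return 1.0 - (1.0 - ber) ** n_bits


def crc_drop_probs(ber: float):
    """Drop probability per attempt for packet CRC and for header-only CRC."""
    packet_crc = any_bit_error_prob(PAYLOAD_BITS + HEADER_BITS, ber)
    header_crc = any_bit_error_prob(HEADER_BITS, ber)
    return packet_crc, header_crc


p_packet, p_header = crc_drop_probs(1e-4)
```

At a bit-error rate of 1e-4 this gives a drop probability of roughly 0.57 per attempt when the CRC covers the whole packet, against roughly 0.047 when it covers the header alone.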

Given the effective application layer throughput in Fig. 6(b) and Fig. 6(e), we test the objective
video performance. We assume an MC-EZBC [3] encoded video bitstream is sent over a wireless Gilbert
channel. The sender can adapt the bitstream based on channel conditions. The video sequence is
monochrome Foreman CIF, 30 fps. The PSNRs shown in Fig. 6(c) and Fig. 6(f) are averages over the
first 100 frames from the single hop and multihop simulations, respectively. We notice that the PSNR
for 802.11 packet CRC drops to zero at higher loss rates; we attribute this to there not being enough
bandwidth to transmit even the base layer of the coded video bitstream. Clearly, we see better
PSNR using our enhanced MAC protocol (header CRC and header FEC). In the multihop case, contention
among the three nodes reduces the performance of the system.

2.2.2  Video  Performance  with  MD-FEC


We further tested the video performance of our proposed scheme using MD-FEC [1,2], with the same
FEC bit allocation scheme as proposed in [1]. Three kinds of simulations were performed: single hop,
multihop without FEC adaptation, and multihop with FEC adaptation (FEC can adapt to network
conditions). The MC-EZBC video bitstream was first encoded with MD-FEC at the maximum bit rate of
1 Mbps. Each GOP was encoded into 128 packets by the MD-FEC encoder for stage 1 FEC, resulting in a
packet size of around 500 bytes. All packets were further encoded with bit-level FEC (stage 2), using a
BCH(4195, 4000, 4) code in both the single hop and multihop simulations. The physical layer average
bit-error rates for each GOP are shown in Figs. 7(d), 7(e) and 7(f) for the Gilbert channel. The
corresponding PSNR of each GOP is shown above each BER graph in Fig. 7. The protection schemes
compared are 802.11 packet CRC, header CRC, and header FEC, all with two-stage FEC.

Since there is almost no contention in the single hop simulation, packet loss is most likely caused by
bit errors in the wireless channel. We see a dramatic performance drop for the 802.11 and header CRC
schemes at a severe bit-error rate ($1 \times 10^{-3}$) in Fig. 7(a). This matches very well with the
trend in Fig. 6(b), where 802.11 has less bandwidth than is required even for the video base layer, and
the header CRC scheme can only accommodate the video base layer. In the multihop simulation without FEC
adaptation, node2 works as an intermediate node that forwards packets to node3; both node1 and node2
are senders, and node2 is also a receiver. In Fig. 7(b), the MD-FEC encoded video bitstream is fixed
at 1 Mbps. The wireless channel is time varying and error prone; therefore, for better protection, the
stage 1 MD-FEC design is based on a 10% packet-loss rate and an average error-burst length of 2 packets.


[Figure 7 panels, each plotted against frame number: (a) single hop w/o adaptation; (b) multihop w/o
adaptation; (c) multihop w/ adaptation; (d) BER for each GOP, single hop; (e) BER for each GOP,
multihop w/o adaptation; (f) BER for each GOP, multihop w/ adaptation.]

Figure 7:  Video PSNR-Y vs. frame number at different channel conditions of each GOP

Due to the limited physical bandwidth and the high number of retransmissions at high bit-error rates,
a large number of contentions and packet drops greatly reduces the effective throughput, which results
in a large video PSNR drop. Though MD-FEC is very powerful, as the channel BER becomes high
($1 \times 10^{-3}$), the probability of retransmission becomes very high, and none of the three
protection schemes works well. Still, the proposed header FEC scheme can transmit part of the base
layer at $1 \times 10^{-3}$ BER. Fig. 7(b) also matches very well with Fig. 6(f). In the multihop
simulation with FEC adaptation of Fig. 7(c), the FEC design is based on feedback from the receiver and
the actual sending rate. At high bit-error rates, the sending rate goes down and FEC can be designed
based on the available sending rate. The sender can truncate the scalable video bitstream to match the
condition of the channel. Therefore, compared to Fig. 7(b), all curves in Fig. 7(c) show better video
PSNR, especially two-stage FEC with header FEC, which performs very well even in the face of severe
channel conditions ($1 \times 10^{-3}$).

3.  DISTRIBUTED  VERSION  OF  FGA-FEC  ALGORITHM 


In this section, we investigate a distributed FGA-FEC scheme for scalable video streaming to
heterogeneous users over a congested multihop network, where we do FGA-FEC decode/re-code at
selected intermediate overlay nodes, and do FGA-FEC adaptation at the remaining nodes. In order to
reduce the overall computational burden, we propose two methods: (1) coordination between optimization
processes running at adjacent nodes to reduce the optimization computation, and (2) extension of our
overlay multihop FEC (OM-FEC [8,9]) to reduce the number of FGA-FEC decode/re-code nodes.
Simulations show that the proposed scheme can greatly reduce computation, and can provide
near-best-possible video quality to diverse users.

Our FGA-FEC can encode scalable video in such a way that both the embedded bitstream and the error
correction codes can be easily and precisely adapted in a multidimensional way to satisfy diverse users
without complex transcoding at intermediate nodes. In that scheme, the server first encodes the
scalable video based on the highest user request and aggregated network conditions, and then sends the
encoded bitstream into the network. Inside the network, the DSNs adapt the FGA-FEC encoded bitstream
to satisfy heterogeneous users by shortening and/or dropping packets. We assumed that there was no
congestion in the network backbone, i.e. that the backbone available bandwidth was large enough to
accommodate all user requirements. This assumption suits service-provider-based structured networks,
where congestion and packet loss mainly happen at the edge of the network or at the last-mile
connection. One problem still remains: in a multihop network, congestion could be anywhere inside the
network, especially in an ad hoc wireless network. How should we modify FGA-FEC to work with a
congested backbone? Here, a congested link is defined as a link whose available bandwidth is less than
the minimum bandwidth required to accommodate a user's video request. One way to address this problem
is a hop-by-hop solution: we can optimize FEC protection for each individual link and apply FGA-FEC
decode/re-code at each DSN for each user. By FGA-FEC decode/re-code, we mean that a DSN decodes the
FGA-FEC of the received GOP, re-optimizes the multiple descriptions, and then re-codes the GOP with a
newly designed FGA-FEC for its downlinks. This would be a heavyweight, computationally intensive
method if done at every overlay node. Here, we argue that it may not be necessary to do FEC
decode/re-code at each DSN.

We need to identify the congested links in the backbone and apply the appropriate transformation at
each DSN. Still, running the full FGA-FEC optimization at even some DSNs may be computationally
demanding. So, here we describe a distributed algorithm, where we do FGA-FEC decode/re-code only at
selected DSNs. The proposed distributed FGA-FEC scheme includes two parts: (1) a coordination method
between FGA-FEC optimization processes running at nearby nodes to reduce the optimization
computation, and (2) the application of OM-FEC [8,9] to reduce the number of FGA-FEC decode/re-code
nodes, i.e. we use FGA-FEC adaptation where permitted and perform FGA-FEC decode/re-code only at
certain key DSNs. This design thus lies between the end-to-end and hop-by-hop paradigms. If there is no
congestion over the backbone, it reduces to the end-to-end FGA-FEC scheme: no FEC decode/re-code is
needed at intermediate nodes, only efficient adaptation. If every backbone link is congested, it
becomes the heavyweight hop-by-hop FEC decode/re-code scheme. For this more advanced distributed
algorithm, we focus only on SNR scalability, and leave extension to resolution and frame-rate
scalability as a topic of future work.

3.1.  DISTRIBUTED  FGA-FEC 


Figure  8.  Streaming  video  from  server  to  users  through  DSNs,  red-dotted  arrows  are  overhead  information 
flows,  black  solid  arrows  are  video  flows. 


We outline our idea in a simplified example as shown in Fig. 8, where a server streams video to 8
diverse users through DSNs over a congested backbone. Before the streaming session, each end user sends
its ideal video request ($D_{\min}$, in terms of distortion) and maximum tolerable distortion
($D_{\max}$) to its directly connected DSN. During the streaming, at each time interval (1 GOP or
multiple GOPs), edge DSNs (such as DSN4, whose downlinks have only end users) initialize optimization
processes for each child to figure out what kind of bitstream to request from the parent DSN (DSN3).
This request is based on the children's link conditions and their video requests. The combined video
request of the child nodes, along with the optimization result, is sent to DSN3 as overhead
information. DSN3 then runs optimizations for its own children, including DSN4 (DSN3 treats DSN4 as one
ordinary user), and generates the request information for its parent DSN2. This process is repeated
until we arrive back at the server. The server then runs the same algorithms as the DSNs to determine
the amount of FEC that should be applied to the video, and then sends the encoded video into the
network. Inside the network, some selected DSNs will decode, redo the FGA-FEC design, and re-code FEC
for some users; the other DSNs are only adaptation nodes. There are two kinds of flows in the
distributed algorithm: upstream overhead information flow (shown via red-dotted arrows in Fig. 8) and
downstream video data flow (shown via black arrows). Each DSN only exchanges optimization information
with its direct parent or children, generating only local overhead information traffic. The DSNs use
this information to coordinate optimization processes running at nearby nodes to reduce the
computational burden, as well as to decide which nodes will be involved in the FGA-FEC decode/re-code.
We apply the idea of OM-FEC to minimize the number of nodes involved in FGA-FEC decode/re-code while
still maintaining near-optimal video quality.

The FGA-FEC optimization algorithm is run at both the DSNs and the video server. A DSN runs
optimization for its children to figure out what kind of bitstream it needs to request from its parent
DSN or the server. The server runs optimization to design the FEC and to encode a GOP. The only
difference between the optimization algorithms running at the DSNs and at the server lies in the input
parameters. In this study, the optimization time interval is one GOP. We defer details to our
published VCIP 2008 paper [10].

The motivation for coordination comes from two observations: (1) video statistical information does
not usually change rapidly between adjacent GOPs, and (2) the server and parent DSNs have the
optimization information from their child DSNs for the same GOP, differing only in the available
bandwidth $B$ and packet-loss probability $p$. Therefore, the problem can be simplified to how to reuse
previous optimization information as network conditions and video statistics change. We use two
coordination methods: (1) search with previous GOP results at this DSN, and (2) search with the current
GOP result from a child node. The edge DSNs (DSNs whose children are all end users) initialize
optimization for a new GOP. There, we can use optimization information from the previous GOP; we call
this method “search with previous GOP.” Intermediate DSNs and the server have local information not
only for the same GOP from child DSNs, but also their own previous GOP optimization result. Thus, they
can use information from either of these GOPs to initialize their optimization search. Using the
optimization information from a child DSN will be called “search with neighbor.” We also consider a
full search method, where each node runs the optimization algorithm independently. There, the only
upstream communication between nodes is the video request. The optimization information shared
between nodes consists of the $\lambda$ values and the rate break points $R_x$.

3.2.  Coordination  to  Reduce  Number  of  FGA-FEC  Decode/re-code  Nodes 

An  extreme  case  of  the  distributed  FGA-FEC  is  hop-by-hop  FGA-FEC  decode/re-code,  i.e.  do  FGA-FEC 
decode/re-code  at  each  DSN.  This  method  can  provide  the  best  possible  video  quality  for  diverse  users  in 
a  congested  backbone,  since  the  protection  is  specifically  optimized  for  each  individual  user.  One  may 
argue  that  it  is  not  necessary  to  do  the  FGA-FEC  decode/re-code  at  each  DSN,  if  only  part  of  the  network 
is  congested.  For  example,  we  already  have  shown  that  if  the  network  backbone  is  not  congested,  our 
simpler  FGA-FEC  adaptation  can  also  provide  a  near  optimal  solution  if  the  user  diversity  is  not  too 
great.  Combining  these  two  ideas  together,  we  do  FGA-FEC  decode/re-code  at  some  selected  nodes, 
while  still  providing  similar  video  quality  to  hop-by-hop  FGA-FEC  decode/re-code.  So  here,  we  apply 
our  OM-FEC  concept  to  the  network  backbone  to  divide  the  network  into  segments  and  hence  minimize 
the  number  of  FGA-FEC  decode/re-code  nodes.  We  use  the  topology  of  Fig.  8  to  illustrate  the  idea.  In 
Fig.  8,  if  there  is  no  congestion  in  the  backbone,  we  can  directly  encode  a  video  using  FGA-FEC  only  at 
the  server  and  then  use  the  simpler  FGA-FEC  adaptation  inside  the  network.  If  some  links  in  the 
backbone  are  congested,  we  need  to  identify  them  and  apply  FGA-FEC  decode/re-code  functions  at  the 
boundary  nodes  of  these  congested  links.  We  still  use  local  information  to  decide  upon  the  congested 
links. 
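The selection of decode/re-code nodes can be sketched as follows, using the definition of a congested link given earlier (available bandwidth below the minimum required for a user's video request). The node names follow Fig. 8, but the link bandwidths, the single chain of backbone links, the function name `select_recode_nodes`, and the choice to mark both endpoints of each congested link are hypothetical; the actual OM-FEC based decision may differ.

```python
def select_recode_nodes(links, min_required_bw):
    """Pick the nodes that would perform FGA-FEC decode/re-code.

    links: list of (parent, child, available_bw) tuples for backbone links.
    Returns the congested links and the set of nodes at their endpoints;
    all other nodes would only perform FGA-FEC adaptation.
    """
    congested = [(u, v) for (u, v, bw) in links if bw < min_required_bw]
    recode_nodes = {node for link in congested for node in link}
    return congested, recode_nodes


# Hypothetical backbone, bandwidths in Mbps.
backbone = [
    ("server", "DSN1", 2.0),
    ("DSN1", "DSN2", 1.5),
    ("DSN2", "DSN3", 0.4),
    ("DSN3", "DSN4", 1.2),
]
congested, recode_nodes = select_recode_nodes(backbone, min_required_bw=0.5)
```

In this example only the DSN2 to DSN3 link is congested, so DSN2 and DSN3 are marked for decode/re-code and the rest of the path relies on adaptation. With no congested link the set is empty, which corresponds to the end-to-end case described above.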

3.3  EXPERIMENTS  AND  SIMULATIONS 

We ran experiments and simulations to show the efficiency of our proposed distributed FGA-FEC scheme
using the videos Foreman (CIF, 18 GOPs), Mobile (SIF, 8 GOPs), and Football (SIF, 7 GOPs), with 16
frames/GOP in all three sequences. The source encoder is MC-EZBC, with N = 64. The proposed scheme
includes two approaches: (1) a coordination method between optimization processes running at adjacent
nodes to reduce computation, and (2) use of the OM-FEC concept to reduce the number of FGA-FEC
decode/re-code nodes while still maintaining near-optimal video quality, measured in terms of PSNR.
For the first approach, we compare the number of iterations needed to reach the optimization stopping
point using “full search,” “search with previous GOP,” and “search with neighbor.” For the latter
approach, we compare with the hop-by-hop FEC decode/re-code scheme and show that we can get similar
video quality with fewer nodes involved in FEC decode/re-code. Finally, we measured the CPU time
of the distributed FGA-FEC algorithm to show its efficiency.

3.3.1  Optimization  Performance

We solve the optimization problem using a bisection search to find the best $\lambda$ value, which
requires a stopping criterion. We use $|R_{\text{total}} - B| < \frac{1}{N} \times B$ and
$|\lambda - \lambda_{\text{previous}}| < \varepsilon$, i.e. the total rate should be close to the
available bandwidth and $\lambda$ should no longer be changing much, where $\varepsilon$ is a
threshold. Intuitively, a larger threshold corresponds to coarser precision. After the optimization,
$\frac{N-1}{N} \times B < R_{\text{total}} < \frac{N+1}{N} \times B$. If $R_{\text{total}} < B$, we
need to allocate more video data to $R_N$ to satisfy $R_{\text{total}} = B$. If
$R_{\text{total}} > B$, we need to remove some video data from $R_N$ to satisfy
$R_{\text{total}} = B$. Experiments show that $\varepsilon = 1 \times 10^{-5}$ is a very good choice,
in that the quality loss is almost negligible.


(a)  Channel  condition  (b)  Number  of  iterations 

Figure 9.  Dynamic channel conditions: the full search algorithm vs. our proposed “search with
previous GOP” and “search with neighbor” methods, in terms of number of iterations on a dynamic
channel. (a) Channel conditions varying over GOP number; (b) the number of iterations to reach the
optimal stopping point.

In Fig. 9, we compare the full search algorithm with our proposed “search with previous GOP” and
“search with neighbor” methods on a dynamic channel, where the channel condition changes over the
GOPs as in Fig. 9(a). The corresponding numbers of iterations to reach the stopping point for the three
methods are shown in Fig. 9(b). Here one iteration is defined as one $\lambda$ step calculation. In the
“full search” method, the bisection search starts from the initial value $\lambda = 1 \times 10^{-3}$
and proceeds to the optimization stopping point. In the “search with previous GOP” method, the first
GOP is handled as in full search: we start from the initial value $\lambda = 1 \times 10^{-3}$ and
search to the optimization stopping point. After the first GOP, we use the final $\lambda$ of the
previous GOP (its optimal point value) as the starting point to optimize the current GOP for the
current network condition. In “search with neighbor,” we use the information for the same GOP,
obtained under the previous network conditions, from the child DSN. With this method, if the network
condition does not change, the optimization value can be used directly without further optimization.
From Fig. 9, we see that when the channel condition changes, “search with previous GOP” and “search
with neighbor” perform similarly, but when the channel condition is statistically consistent,
“search with neighbor” gains over “search with previous GOP,” saving about 2 iterations on average.
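The benefit of a good starting point can be illustrated with a small sketch of the search. The stopping test follows the criterion stated in this section; everything else is an assumption made for illustration: the toy rate function `toy_rate` (total rate decreasing in $\lambda$), the bracketing step that grows an interval outward from the starting value before bisecting, the parameter `step0`, and the function name `search_lambda`. The real optimization evaluates the FGA-FEC allocation, which is described in [10].

```python
def search_lambda(rate_fn, B, N, lam_start, eps=1e-5, step0=5e-5, max_calls=200):
    """Bracket-then-bisect search for lambda; returns (lambda, rate evaluations).

    rate_fn(lam) must be decreasing in lam. The search stops when
    |R_total - B| < B/N and |lambda - lambda_previous| < eps.
    """
    calls = 0

    def rate(lam):
        nonlocal calls
        calls += 1
        return rate_fn(lam)

    lo = hi = lam_start
    step = step0
    if rate(lam_start) > B:
        # Rate too high: push the upper end up until the rate falls to B or below.
        hi = lo + step
        while rate(hi) > B:
            lo, step = hi, 2 * step
            hi = lo + step
    else:
        # Rate too low: push the lower end down until the rate reaches B or above.
        lo = max(hi - step, 0.0)
        while rate(lo) < B and lo > 0.0:
            hi, step = lo, 2 * step
            lo = max(hi - step, 0.0)

    lam_prev = lam = lam_start
    while calls < max_calls:
        lam = 0.5 * (lo + hi)
        r = rate(lam)
        if abs(r - B) < B / N and abs(lam - lam_prev) < eps:
            break
        if r > B:
            lo = lam
        else:
            hi = lam
        lam_prev = lam
    return lam, calls


def toy_rate(lam):
    """Hypothetical stand-in for R_total(lambda), in bits per second."""
    return 2.0e6 / (1.0 + 1000.0 * lam)


B, N = 0.9e6, 64
lam_cold, calls_cold = search_lambda(toy_rate, B, N, lam_start=1e-3)    # "full search"
lam_warm, calls_warm = search_lambda(toy_rate, B, N, lam_start=1.2e-3)  # reuse a nearby result
```

Starting from a value close to the solution, as the two coordination methods do, needs fewer rate evaluations in this toy setting than starting from the fixed initial value.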

The  results  in  this  section  show  that  the  coordination  between  adjacent  nodes  can  greatly  reduce  the 
optimization  computation.  Full  comparisons  are  given  in  the  VCIP  paper  [10]. 

4.  CONCLUSIONS 


We  proposed  a  two-stage  FEC  scheme  with  an  enhanced  MAC  protocol  to  efficiently  support  multimedia 
data  transmission  over  wireless  LANs.  We  enhance  the  MAC/PHY  layers  to  efficiently  support 
multimedia  flows  by  using  both  header  CRC  and  FEC.  We  also  slightly  modified  the  protocol  stack  so 
that  it  can  deliver  packets  with  errors  from  the  MAC  layer  to  the  application  layer,  instead  of  just 
dropping  them.  The  proposed  scheme  combines  bit-level  protection  codes  (good  at  random  bit  error 
correction)  and  symbol  level  codes  (powerful  at  correcting  burst  losses)  to  correct  both  bit  errors  at 
MAC/PHY  layers  and  packet  losses  at  the  application  layer. 

In this project, we also devised a distributed FGA-FEC algorithm for video streaming to diverse users
on a congested network. We proposed a distributed approach that greatly reduces the computational
burden of optimization by exchanging overhead information between adjacent nodes. We extended the idea
of OM-FEC to determine the congested links and hence to reduce the number of FGA-FEC decode/re-code
nodes needed. Here we apply FGA-FEC adaptation whenever permitted and do FGA-FEC decode/re-code only
at the edge of congested links. Simulations have shown the performance of the proposed scheme.